The Aggregating Algorithm and Regression

Author

  • Steven Busuttil
Abstract

Our main interest is in the problem of making predictions in the online mode of learning, where at every step in time a signal arrives and a prediction must be made before the corresponding outcome arrives. Loss is suffered if the prediction and outcome do not match perfectly. In the prediction with expert advice framework, this protocol is augmented by a pool of experts that produce their predictions before we have to make ours. The Aggregating Algorithm (AA) is a technique that optimally merges these experts so that the resulting strategy suffers a cumulative loss that is almost as good as that of the best expert in the pool. The AA was applied to the problem of regression, where outcomes are continuous real numbers, to obtain the AA for Regression (AAR) and its kernel version, KAAR. On typical datasets, KAAR's empirical performance is not as good as that of Kernel Ridge Regression (KRR), a popular regression method; KAAR performs better than KRR only when the data is corrupted with a lot of noise or contains severe outliers. To alleviate this, we introduce methods that are a hybrid between KRR and KAAR. Empirical experiments suggest that, in general, these new methods perform as well as or better than both KRR and KAAR.

In the second part of this dissertation we deal with a more difficult problem: we allow the dependence of outcomes on signals to change with time. To handle this we propose two new methods, WeCKAAR and KAARCh. WeCKAAR is a simple modification of one of our methods from the first part of the dissertation that includes decaying weights. KAARCh is an application of the AA to the case where the experts are all the predictors that can change with time. We show that KAARCh suffers a cumulative loss that is almost as good as that of any expert that does not change very rapidly. Empirical results on data with changing dependencies demonstrate that WeCKAAR and KAARCh perform well in practice and are considerably better than Kernel Ridge Regression.
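To make the contrast between KRR and KAAR concrete, here is a minimal Python sketch of the two dual-form predictions, assuming a Gaussian kernel and toy data of our own choosing (the function names and the data are illustrative assumptions, not the dissertation's code). KRR predicts with y^T (K + aI)^{-1} k(x); the KAAR-style prediction additionally includes the new signal in the kernel matrix with a provisional outcome of 0, which acts as extra regularisation.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    # Pairwise Gaussian (RBF) kernel between the rows of A and the rows of B.
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

def krr_predict(X, y, x_new, a=1.0, sigma=1.0):
    # Kernel Ridge Regression in dual form: y^T (K + aI)^{-1} k(x_new).
    K = gaussian_kernel(X, X, sigma)
    k = gaussian_kernel(X, x_new[None, :], sigma).ravel()
    return y @ np.linalg.solve(K + a * np.eye(len(X)), k)

def kaar_predict(X, y, x_new, a=1.0, sigma=1.0):
    # KAAR-style prediction: append the new signal with a provisional
    # outcome of 0 and apply the same ridge-type formula; this extra
    # regularisation is what makes the prediction more conservative than KRR.
    X_ext = np.vstack([X, x_new[None, :]])
    y_ext = np.append(y, 0.0)
    K = gaussian_kernel(X_ext, X_ext, sigma)
    k = K[:, -1]
    return y_ext @ np.linalg.solve(K + a * np.eye(len(X_ext)), k)

# Toy online protocol: predict each outcome before it is revealed.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(50, 1))
y = np.sin(3 * X).ravel() + 0.1 * rng.standard_normal(50)
for t in range(5, 8):
    print(krr_predict(X[:t], y[:t], X[t]), kaar_predict(X[:t], y[:t], X[t]), y[t])
```

The dissertation's hybrid methods sit between these two predictions; their exact form is not reproduced here.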


Similar resources

Competitive On-line Linear Regression

We apply a general algorithm for merging prediction strategies (the Aggregating Algorithm) to the problem of linear regression with the square loss; our main assumption is that the response variable is bounded. It turns out that for this particular problem the Aggregating Algorithm resembles, but is slightly different from, the well-known ridge estimation procedure. From general results about th...
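In the linear case the difference alluded to above can be written explicitly. With signals x_t and outcomes y_t, ridge regression and the Aggregating Algorithm for Regression (the Vovk-Azoury-Warmuth forecaster) predict as below; the notation is ours and the regularisation constant a > 0 may be parameterised differently in the paper.

```latex
% Ridge regression builds its matrix from past signals only:
\gamma_T^{\mathrm{RR}}  = x_T^{\top}\Bigl(aI + \sum_{t=1}^{T-1} x_t x_t^{\top}\Bigr)^{-1}\sum_{t=1}^{T-1} y_t x_t
% The Aggregating Algorithm for Regression also includes the current signal x_T:
\gamma_T^{\mathrm{AAR}} = x_T^{\top}\Bigl(aI + \sum_{t=1}^{T} x_t x_t^{\top}\Bigr)^{-1}\sum_{t=1}^{T-1} y_t x_t
```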


An Upper Bound for Aggregating Algorithm for Regression with Changing Dependencies

The paper presents a competitive prediction-style upper bound on the square loss of the Aggregating Algorithm for Regression with Changing Dependencies in the linear case. The algorithm is able to compete with a sequence of linear predictors provided the sum of squared Euclidean norms of differences of regression coefficient vectors grows at a sublinear rate.
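In symbols, and in our own notation, the condition on the comparator sequence of regression coefficient vectors u_1, ..., u_T described above is of the form

```latex
\sum_{t=2}^{T} \lVert u_t - u_{t-1} \rVert_2^{2} = o(T)
```

i.e. the total squared movement of the coefficients must grow sublinearly in the number of steps T; the exact constants in the resulting loss bound are given in the paper.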


Aggregated Estimators and Empirical Complexity for Least Square Regression

Numerous empirical results have shown that combining regression procedures can be a very efficient method. This work provides PAC bounds for the L2 generalization error of such methods. The interest of these bounds is twofold. First, they give, for any aggregating procedure, a bound on the expected risk depending on the empirical risk and the empirical complexity measured by the Kullback-Leibler...


Prediction with Expert Advice under Discounted Loss

We study prediction with expert advice in the setting where the losses are accumulated with some discounting and the impact of old losses can gradually vanish. We generalize the Aggregating Algorithm and the Aggregating Algorithm for Regression, propose a new variant of the exponentially weighted average algorithm, and prove bounds on the cumulative discounted loss.
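As an illustration of this setting (a generic sketch, not necessarily the paper's exact variant), an exponentially weighted average forecaster with discounted losses can be written as follows; the discount factor alpha and the learning rate eta are assumptions of this sketch.

```python
import numpy as np

def discounted_ewa(expert_preds, outcomes, eta=1.0, alpha=0.9):
    # Exponentially weighted average forecaster whose per-expert losses
    # are discounted by alpha at every step, so old losses gradually vanish.
    n_steps, n_experts = expert_preds.shape
    cum_loss = np.zeros(n_experts)      # discounted cumulative square losses
    forecasts = np.empty(n_steps)
    for t in range(n_steps):
        w = np.exp(-eta * cum_loss)
        w /= w.sum()
        forecasts[t] = w @ expert_preds[t]                  # merged prediction
        step_loss = (expert_preds[t] - outcomes[t]) ** 2    # square loss of each expert
        cum_loss = alpha * cum_loss + step_loss             # discount, then accumulate
    return forecasts

# Toy example: two constant experts tracking a drifting outcome sequence.
outcomes = np.concatenate([np.zeros(50), np.ones(50)])
experts = np.column_stack([np.zeros(100), np.ones(100)])
print(discounted_ewa(experts, outcomes)[[0, 49, 50, 99]])
```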


Effective Learning to Rank Persian Web Content

Persian is one of the most widely used languages on the Web. Hence, the Persian Web contains invaluable information that needs to be retrieved effectively. As in other languages, ranking algorithms for Persian Web content deal with various challenges, such as applicability issues in real-world situations and the lack of user modeling. CF-Rank, as a ...


On-line Prediction with Kernels and the Complexity Approximation Principle

The paper describes an application of the Aggregating Algorithm to the problem of regression. It generalizes earlier results concerned with plain linear regression to kernel techniques and presents an on-line algorithm which performs nearly as well as any oblivious kernel predictor. The paper contains the derivation of an estimate of the performance of this algorithm. The estimate is then used to d...




Publication date: 2008